Superresolution without Separation
This paper provides a theoretical analysis of diffraction-limited
superresolution, demonstrating that arbitrarily close point sources can be
resolved in ideal situations. Precisely, we assume that the incoming signal is
a linear combination of M shifted copies of a known waveform with unknown
shifts and amplitudes, and one only observes a finite collection of evaluations
of this signal. We characterize properties of the base waveform such that the
exact translations and amplitudes can be recovered from 2M + 1 observations.
This recovery is achieved by solving a weighted version of basis pursuit over
a continuous dictionary. Our methods combine classical polynomial interpolation
techniques with contemporary tools from compressed sensing.
Comment: 23 pages, 8 figures
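The recovery procedure can be illustrated by discretizing the continuous dictionary on a fine grid. The sketch below is a minimal illustration, not the paper's exact construction: the Gaussian waveform, the grid resolution, and the particular weighting are all assumptions made here for concreteness.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative known waveform: a Gaussian bump (an assumption; the paper
# treats a general class of base waveforms).
def psf(t):
    return np.exp(-t**2 / (2 * 0.2**2))

# Hypothetical ground truth: M = 2 sources, so 2M + 1 = 5 observations.
true_shifts = np.array([0.3, 0.7])
true_amps = np.array([1.0, 0.5])
samples = np.linspace(0, 1, 5)
y = sum(a * psf(samples - s) for a, s in zip(true_amps, true_shifts))

# Discretize the continuous dictionary of shifted waveforms on a fine grid.
grid = np.linspace(0, 1, 201)
A = psf(samples[:, None] - grid[None, :])      # 5 x 201 dictionary matrix

# Weighted basis pursuit with nonnegative amplitudes:
#     minimize  sum_j w_j x_j   subject to  A x = y,  x >= 0.
# The weights below are one simple choice; the paper derives the
# appropriate weighting from the waveform itself.
w = A.sum(axis=0)
res = linprog(c=w, A_eq=A, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x
support = grid[x_hat > 1e-6]                   # estimated source locations
```

Because the true shifts lie on the grid here, the linear program is exactly feasible; off-grid sources are what the paper's continuous-dictionary analysis handles.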
The geometry of kernelized spectral clustering
Clustering of data sets is a standard problem in many areas of science and
engineering. The method of spectral clustering is based on embedding the data
set using a kernel function, and using the top eigenvectors of the normalized
Laplacian to recover the connected components. We study the performance of
spectral clustering in recovering the latent labels of i.i.d. samples from a
finite mixture of nonparametric distributions. The difficulty of this label
recovery problem depends on the overlap between mixture components and how
easily a mixture component is divided into two nonoverlapping components. When
the overlap is small compared to the indivisibility of the mixture components,
the principal eigenspace of the population-level normalized Laplacian operator
is approximately spanned by the square-root kernelized component densities. In
the finite sample setting, and under the same assumption, embedded samples from
different components are approximately orthogonal with high probability when
the sample size is large. As a corollary we control the fraction of samples
mislabeled by spectral clustering under finite mixtures with nonparametric
components.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1283 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
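A minimal sketch of the procedure the abstract describes: kernel embedding, top eigenvectors of the normalized Laplacian, then clustering of the embedded samples. The data, the kernel bandwidth, and the use of k-means on the embedding are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
# Hypothetical data: i.i.d. samples from a two-component mixture whose
# overlap is small relative to the indivisibility of each component.
x = np.concatenate([rng.normal(-3, 0.5, 50), rng.normal(3, 0.5, 50)])

# Gaussian kernel matrix (bandwidth is an arbitrary illustrative choice).
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 1.0 ** 2))

# Normalized affinity D^{-1/2} K D^{-1/2}; its top eigenvectors span the
# principal eigenspace of the normalized Laplacian referred to above.
d = K.sum(axis=1)
M = K / np.sqrt(d[:, None] * d[None, :])
_, vecs = np.linalg.eigh(M)          # eigenvalues returned in ascending order
embed = vecs[:, -2:]                 # embed samples via the top 2 eigenvectors

# Embedded samples from different components are nearly orthogonal, so
# k-means on the row-normalized embedding recovers the latent labels.
embed /= np.linalg.norm(embed, axis=1, keepdims=True)
_, labels = kmeans2(embed, 2, minit="++", seed=0)
```

With well-separated components the kernel matrix is nearly block-diagonal, which is why the top eigenspace is approximately spanned by per-component vectors.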
Learning Developmental Landscapes from Single Cell Gene Expression with Optimal Transport
Non UBC; Unreviewed; Author affiliation: MIT; Postdoctoral
Sparse Inverse Problems: The Mathematics of Precision Measurement
The interplay between theory and experiment is the key to progress in the natural sciences. This thesis develops the mathematics of distilling knowledge from measurement. Specifically, we consider the inverse problem of recovering the input to a measurement apparatus from the observed output. We present separate analyses for two different models of input signals. The first setup is superresolution. Here, the input is a collection of continuously parameterized sources, and we observe a weighted superposition of signals from all of the sources. The second setup is unsupervised classification. The input is a collection of categories, and the output is an unlabeled set of objects from the different categories. In Chapter 1 we introduce these measurement modalities in greater detail and place them in a common framework.

Chapter 2 provides a theoretical analysis of diffraction-limited superresolution, demonstrating that arbitrarily close point sources can be resolved in ideal situations. Precisely, we assume that the incoming signal is a linear combination of M shifted copies of a known waveform with unknown shifts and amplitudes, and one only observes a finite collection of evaluations of this signal. We characterize properties of the base waveform such that the exact translations and amplitudes can be recovered from 2M + 1 observations. This recovery can be achieved by solving a weighted version of basis pursuit over a continuous dictionary. Our analysis shows that l1-based methods enjoy the same separation-free recovery guarantees as polynomial root-finding techniques such as Prony’s method or Vetterli’s method for signals of finite rate of innovation. Our proof techniques combine classical polynomial interpolation techniques with contemporary tools from compressed sensing.

In Chapter 3 we propose a variant of the classical conditional gradient method (CGM) for superresolution problems with differentiable measurement models. Our algorithm combines nonconvex and convex optimization techniques: we propose global conditional gradient steps alternating with nonconvex local search exploiting the differentiable observation model. This hybridization gives the theoretical global optimality guarantees and stopping conditions of convex optimization along with the performance and modeling flexibility associated with nonconvex optimization. Our experiments demonstrate that our technique achieves state-of-the-art results in several applications.

Chapter 4 focuses on unsupervised classification. Clustering of data sets is a standard problem in many areas of science and engineering. The method of spectral clustering is based on embedding the data set using a kernel function, and using the top eigenvectors of the normalized Laplacian to recover the connected components. We study the performance of spectral clustering in recovering the latent labels of i.i.d. samples from a finite mixture of nonparametric distributions. The difficulty of this label recovery problem depends on the overlap between mixture components and how easily a mixture component is divided into two nonoverlapping components. When the overlap is small compared to the indivisibility of the mixture components, the principal eigenspace of the population-level normalized Laplacian operator is approximately spanned by the square-root kernelized component densities. In the finite sample setting, and under the same assumption, embedded samples from different components are approximately orthogonal with high probability when the sample size is large. As a corollary we control the fraction of samples mislabeled by spectral clustering under finite mixtures with nonparametric components.
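The hybrid scheme described for Chapter 3 can be sketched as alternating a global conditional-gradient step (adding the candidate source most correlated with the current residual) with nonconvex local search through the differentiable measurement model. Everything concrete below is an illustrative assumption: the Gaussian model, the step sizes, and the iteration counts.

```python
import numpy as np

# Differentiable measurement model: sum of Gaussian bumps (an assumption;
# the thesis allows general differentiable observation models).
SIGMA = 0.2
samples = np.linspace(0, 1, 30)

def forward(shifts, amps):
    return (amps[None, :] * np.exp(-(samples[:, None] - shifts[None, :]) ** 2
                                   / (2 * SIGMA**2))).sum(axis=1)

# Hypothetical ground truth to recover.
y = forward(np.array([0.35, 0.65]), np.array([1.0, 0.8]))

shifts, amps = np.empty(0), np.empty(0)
grid = np.linspace(0, 1, 200)
for _ in range(2):  # one new source per outer iteration
    # Global conditional-gradient step: add the grid atom most correlated
    # with the current residual.
    residual = y - forward(shifts, amps)
    atoms = np.exp(-(samples[:, None] - grid[None, :]) ** 2 / (2 * SIGMA**2))
    shifts = np.append(shifts, grid[np.argmax(atoms.T @ residual)])
    amps = np.append(amps, 1.0)
    # Nonconvex local search: joint gradient descent on all shifts and
    # amplitudes through the differentiable model.
    for _ in range(2000):
        r = forward(shifts, amps) - y
        G = np.exp(-(samples[:, None] - shifts[None, :]) ** 2 / (2 * SIGMA**2))
        grad_amps = G.T @ r
        grad_shifts = amps * ((G * (samples[:, None] - shifts[None, :])
                               / SIGMA**2).T @ r)
        amps -= 0.05 * grad_amps
        shifts -= 5e-4 * grad_shifts
```

The conditional-gradient step supplies a good initialization and a convex-optimization stopping criterion, while the local search moves sources off the grid, which is the flexibility the discretized convex approach alone lacks.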
Trajectory Inference via Mean-field Langevin in Path Space
Trajectory inference aims at recovering the dynamics of a population from
snapshots of its temporal marginals. To solve this task, a min-entropy
estimator relative to the Wiener measure in path space was introduced by
Lavenant et al. arXiv:2102.09204, and shown to consistently recover the
dynamics of a large class of drift-diffusion processes from the solution of an
infinite dimensional convex optimization problem. In this paper, we introduce a
grid-free algorithm to compute this estimator. Our method consists of a family
of point clouds (one per snapshot) coupled via Schrödinger bridges, which
evolve with noisy gradient descent. We study the mean-field limit of the
dynamics and prove its global convergence to the desired estimator. Overall,
this leads to an inference method with end-to-end theoretical guarantees that
solves an interpretable model for trajectory inference. We also show how to
adapt the method to deal with mass variations, a useful extension when dealing
with single cell RNA-sequencing data, where cells can branch and die.
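At the discrete level, the coupling between two adjacent point clouds can be approximated by entropic optimal transport computed with Sinkhorn iterations. The sketch below shows only this building block, not the full noisy-gradient-descent dynamics of the paper; the one-dimensional snapshots and the regularization level eps are hypothetical choices.

```python
import numpy as np

def sinkhorn_coupling(x, y, eps=0.1, n_iter=500):
    """Entropic-OT coupling between two point clouds with uniform weights,
    a discrete stand-in for the Schrodinger bridge between snapshots."""
    a = np.full(len(x), 1.0 / len(x))      # source marginal
    b = np.full(len(y), 1.0 / len(y))      # target marginal
    C = (x[:, None] - y[None, :]) ** 2     # squared-distance cost
    K = np.exp(-C / eps)                   # Gibbs kernel
    v = np.ones(len(y))
    for _ in range(n_iter):
        u = a / (K @ v)                    # rescale to match row marginals
        v = b / (K.T @ u)                  # rescale to match column marginals
    return u[:, None] * K * v[None, :]     # transport plan

rng = np.random.default_rng(1)
x0 = rng.normal(0.0, 0.1, 40)   # snapshot at time t0 (hypothetical data)
x1 = rng.normal(1.0, 0.1, 40)   # snapshot at time t1
P = sinkhorn_coupling(x0, x1)
```

In the full method one such bridge couples each consecutive pair of snapshots, and the point clouds themselves are then evolved by noisy gradient descent whose mean-field limit converges to the min-entropy estimator.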